Throughout the refactoring process, it was essential that we could run and test the package at every step. But how could we rewrite all of our functions without breaking the rest of the code base?
example = function(arg1, arg2) {
  # Messy zone: unpack the awkward inputs
  a_better_name = arg1$mess$ugh
  helpful_name = arg2$what$is$this
  # Refactor the internals
  useful_result = a_better_name + helpful_name
  sensible_name_tibble = a_better_name * helpful_name
  # Messy zone: repack into the structure the rest of the code expects
  results = list()
  results$some$mess = useful_result
  results$another$naff$list = sensible_name_tibble
  results
}
Once the inner functions are clean
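One way to read the messy-zone idea is that a thin wrapper keeps the old interface and only unpacks and repacks the awkward structures, while a clean inner function does the real work. The following is a minimal sketch under that assumption: `compute_results()` is a hypothetical name, and the list shapes are borrowed from the example above rather than from the real package.

```r
# Hypothetical clean inner function: plain inputs, plainly named outputs
compute_results = function(a_better_name, helpful_name) {
  list(
    useful_result = a_better_name + helpful_name,
    sensible_name_tibble = a_better_name * helpful_name
  )
}

# The wrapper keeps the original signature, so existing callers still work
example = function(arg1, arg2) {
  # Messy zone: unpack the awkward inputs
  clean = compute_results(
    a_better_name = arg1$mess$ugh,
    helpful_name = arg2$what$is$this
  )
  # Messy zone: repack into the structure the rest of the code expects
  results = list()
  results$some$mess = clean$useful_result
  results$another$naff$list = clean$sensible_name_tibble
  results
}

# Toy inputs (assumed shapes) to show the wrapper still runs end to end
arg1 = list(mess = list(ugh = 2))
arg2 = list(what = list(is = list(this = 3)))
out = example(arg1, arg2)
```

Because the mess is confined to the first and last few lines of the wrapper, the inner function can be rewritten and tested on its own without touching any caller.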
Jane and I had one session where we didn’t touch any code at all. We talked, doodled and drew diagrams. You might leave a session like this feeling a little deflated that you didn’t achieve anything. However, that session was actually the most valuable. In the following session we made huge progress, because we had already done the hard work of thinking the design through. We were able to whizz through the functions, implementing our new design efficiently. We found ourselves constantly referring back to the diagrams to remind ourselves of the design choices we had made.
My favourite tools:
We all know that it’s important to choose good names for parameters and functions. However, in this project, I was surprised by just how much of a difference a good name makes. Sometimes, the only thing we would change in a function would be the names. Often a simple rename turned the unintelligible code in front of me into a clear, readable explanation of the approach.
Things will go wrong
Deleted code
Brackets
You want to know as soon as possible
Ensure that your code is always runnable
Check your code still works by running your tests, e.g. with {testthat}
For statistical models, ensure numerical results are unaffected by the refactor
CI/CD
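As a rough illustration of the testing points above, a regression test can compare the refactored code against the original on the same inputs. This is a minimal sketch, assuming {testthat} is installed; `old_model()` and `new_model()` are hypothetical stand-ins for the pre- and post-refactor code, not functions from the actual package.

```r
library(testthat)

# Hypothetical stand-in for the original implementation
old_model = function(x) {
  total = 0
  for (value in x) {
    total = total + value^2
  }
  total / length(x)
}

# Hypothetical stand-in for the refactored implementation
new_model = function(x) {
  mean(x^2)
}

# Made-up inputs, used only to exercise both versions
inputs = c(0.5, 1.5, 2.5, 4)

test_that("refactored model reproduces the original results", {
  # A small tolerance allows for floating-point differences only
  expect_equal(new_model(inputs), old_model(inputs), tolerance = 1e-8)
})
```

Running a test like this after every change, locally or in a CI pipeline, means a numerical drift shows up straight away rather than several refactors later.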
The code was initially what I would call “How” programming. The different components of the functions were grouped by how the calculations were computed programmatically rather than why we were calculating them. This made it hard for someone new to the code to understand what each function did.
I’m not an environmental scientist, so I don’t understand all of the science behind Jane’s complex model. However, by asking questions about what she was trying to achieve, we regrouped the different stages of each function in terms of the science, rather than the implementation. Changing the focus of the code to the scientific method made it much clearer to follow.
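To make the contrast concrete, here is a toy sketch. The temperature example and every function name in it are invented for illustration and have nothing to do with Jane’s model; the point is only that the same calculation can be split by technique or by purpose.

```r
# "How" grouping: steps named after the programming technique used
apply_loops = function(readings) {
  for (i in seq_along(readings)) {
    readings[i] = readings[i] * 1.8 + 32
  }
  readings
}
vector_ops = function(readings) {
  mean(readings)
}

# "Why" grouping: the same steps named after what they are for
convert_to_fahrenheit = function(celsius) {
  celsius * 1.8 + 32
}
average_temperature = function(temperatures) {
  mean(temperatures)
}

# Made-up readings, used only to show both versions agree
celsius_readings = c(10, 20, 30)
how_result = vector_ops(apply_loops(celsius_readings))
why_result = average_temperature(convert_to_fahrenheit(celsius_readings))
```

Both pipelines compute the same number, but only the second one tells a newcomer what is being calculated and why.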
Define the messy zone
Push the mess up
Start with a blank slate
Take time to design
A good name goes a long way
Test regularly
Why rather than How
Do it with a friend